A Statistics-Guided Progressive RAST Algorithm for Peak Template Matching in GCxGC
Abstract
Comprehensive two-dimensional gas chromatography (GCxGC) is an emerging technology for chemical separation. Chemical identification is one of the critical tasks in GCxGC analysis. Peak template matching is a technique for automatic chemical identification. Peak template matching can be formulated as a point pattern matching problem. This paper proposes a progressive RAST algorithm to solve the problem. Search space pruning techniques based on peak location distributions and transformation distributions are also investigated for guided search. Experiments on seven real data sets indicate that the new techniques are effective.

1. PEAK TEMPLATE MATCHING FOR GCXGC

Comprehensive two-dimensional gas chromatography (GCxGC) is an emerging technology for chemical separation that provides an order-of-magnitude increase in separation capacity over traditional GC [1]. Given a chemical sample, the output data of GCxGC can be represented, visualized, and processed as an image. In the image, each resolved chemical substance produces a small peak or cluster of pixels with values that are larger than the background values. The objective of GCxGC analysis is to produce an accurate report on the observed chemicals and their quantity in a sample. The major image analysis tasks include:
1. Separating individual peaks from background,
2. Quantifying each peak, and
3. Identifying the chemicals for peaks of interest.

GCxGC images contain potentially thousands of peaks in complex patterns, making chemical identification a challenging problem. Manual identification of chemicals is tedious and time-consuming. An alternative is to use peak template matching. A peak template is a set of peaks with known chemical names and other characteristics (e.g., whether a chemical is an internal standard). Simple templates are created through interactive annotation. Template matching tries to establish as many correspondences as possible from peaks in the template to peaks in the target peak set.
After correspondences are established, the information (e.g., chemical name) carried by the peaks in the template is copied into the corresponding peaks in the target peak set. Consequently, all the matched chemicals in the target peak set are identified.

Two types of information are associated with template peaks: computed features (peak location, area, volume, shape, etc.) and annotated information (chemical names, chemical group names, etc.). Typically, only computed features are used for matching. In this paper, we consider peak location (the coordinates of the pixel with the largest value within the peak) as the primary feature for matching. In such a case, the peak template matching problem becomes a point pattern matching problem. The problem is formalized as follows: Given point template (template point set) $P = \{p_i(x_i, y_i)\}_{i=1}^{m}$, target point set $Q = \{q_i(u_i, v_i)\}_{i=1}^{n}$, and a transformation space $T$, find a transformation $t$ in $T$ that maximizes the number of points in $P$ that can be matched with points in $Q$.

∗This material is based upon work supported by the National Science Foundation under Grant No. 0231746.

2. RECOGNITION BY ADAPTIVE SUBDIVISIONS OF TRANSFORMATION SPACE

A wide variety of techniques have been developed for solving point pattern matching problems, including searching matching space [2], alignment [3], Hough transforms [4], geometric hashing (also called pose clustering) [5], minimizing Hausdorff distance [6, 7], computational geometry [8], etc. Recognition by Adaptive Subdivision of Transformation Space (RAST) is another family of algorithms [9, 10]. The fundamental idea of RAST is hierarchical searching for a globally optimal solution in the transformation space. With RAST, each pair of a template point and a target point defines a constraint set (a region) in the transformation space, containing the transformations that can match the two points within some distance tolerance.
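The matching objective and the notion of a constraint set can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`in_constraint_set`, `match_count`), the use of Euclidean distance, and the tolerance parameter `eps` are assumptions made for illustration, and the count does not enforce one-to-one correspondences.

```python
from math import hypot
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
Transform = Callable[[Point], Point]


def in_constraint_set(t: Transform, p: Point, q: Point, eps: float) -> bool:
    """Return True if transformation t maps template point p to within eps of target point q.

    Equivalently, t lies in the constraint set defined by the pair (p, q).
    Euclidean distance is an assumption; the text only says "some distance tolerance".
    """
    x, y = t(p)
    return hypot(x - q[0], y - q[1]) <= eps


def match_count(t: Transform, template: Sequence[Point],
                target: Sequence[Point], eps: float) -> int:
    """Count template points that t brings within eps of at least one target point."""
    return sum(
        1
        for p in template
        if any(in_constraint_set(t, p, q, eps) for q in target)
    )


# Hypothetical data: a pure shift by (2, 0) matches two of the three template points.
template = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
target = [(2.0, 0.0), (3.0, 1.0), (9.0, 9.0)]
shift = lambda p: (p[0] + 2.0, p[1])
print(match_count(shift, template, target, eps=0.5))
```

Point pattern matching then amounts to searching the transformation space for the transformation with the largest such count.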
Two constraint sets are called compatible if their template points are different. To find a transformation that matches k template points, RAST finds a point in transformation space where k compatible constraint sets overlap. RAST starts with some initial region (usually rectangular) in the transformation space, and computes which of the O(mn) constraint sets intersect the region. If enough compatible ones do, it then subdivides the region and repeats the calculation on each of the subregions. Otherwise, it rejects the region. The algorithm accepts a region if it intersects enough constraint sets and its size is smaller than some preset threshold.

The information about the transformation space is implicitly hard-coded in RAST algorithms. This paper uses constrained global affine transformations. The constrained global affine transformation from $p(x_p, y_p)$ to $q(u_q, v_q)$ is:

$$\begin{bmatrix} u_q \\ v_q \end{bmatrix} = \begin{bmatrix} s_x & h_x\,(= 0.0) \\ h_y & s_y \end{bmatrix} \begin{bmatrix} x_p \\ y_p \end{bmatrix}$$
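The subdivide, reject, and accept steps described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the progressive algorithm the paper proposes: the transformation is taken to be only the 2x2 matrix with $h_x = 0$, read as $u = s_x x$ and $v = h_y x + s_y y$; the tolerance is applied per coordinate; the region is an axis-aligned box over $(s_x, h_y, s_y)$; and the names `pair_intersects`, `upper_bound`, and `rast_search` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]
Interval = Tuple[float, float]
Box = Tuple[Interval, Interval, Interval]  # parameter ranges for (sx, hy, sy)


def scaled(iv: Interval, c: float) -> Interval:
    """Range of c * s for s in the interval iv (handles negative c)."""
    a, b = iv[0] * c, iv[1] * c
    return (min(a, b), max(a, b))


def pair_intersects(box: Box, p: Point, q: Point, eps: float) -> bool:
    """True if some transformation in box maps p to within eps of q in each coordinate."""
    sx, hy, sy = box
    u_lo, u_hi = scaled(sx, p[0])            # u = sx * x
    a_lo, a_hi = scaled(hy, p[0])            # hy * x
    b_lo, b_hi = scaled(sy, p[1])            # sy * y
    v_lo, v_hi = a_lo + b_lo, a_hi + b_hi    # v = hy * x + sy * y
    return (u_lo - eps <= q[0] <= u_hi + eps) and (v_lo - eps <= q[1] <= v_hi + eps)


def upper_bound(box: Box, template: Sequence[Point],
                target: Sequence[Point], eps: float) -> int:
    """Number of template points with at least one constraint set intersecting box.

    Counting each template point once keeps the counted constraint sets compatible.
    """
    return sum(
        1 for p in template
        if any(pair_intersects(box, p, q, eps) for q in target)
    )


def rast_search(box: Box, template: Sequence[Point], target: Sequence[Point],
                eps: float, k: int, min_width: float) -> Optional[Box]:
    """Depth-first search for a small box that intersects at least k compatible constraint sets."""
    if upper_bound(box, template, target, eps) < k:
        return None                           # reject the region
    widths = [hi - lo for lo, hi in box]
    widest = max(range(3), key=widths.__getitem__)
    if widths[widest] <= min_width:
        return box                            # accept: enough sets, small enough
    lo, hi = box[widest]
    mid = (lo + hi) / 2.0
    for half in ((lo, mid), (mid, hi)):       # subdivide along the widest dimension
        child = box[:widest] + (half,) + box[widest + 1:]
        found = rast_search(child, template, target, eps, k, min_width)
        if found is not None:
            return found
    return None


# Hypothetical data generated with sx = 1.5, hy = 0.2, sy = 0.8, plus one clutter point.
template = [(1.0, 1.0), (2.0, 1.0), (1.0, 3.0)]
target = [(1.5, 1.0), (3.0, 1.2), (1.5, 2.6), (10.0, 10.0)]
start: Box = ((0.5, 2.5), (-0.5, 0.5), (0.5, 1.5))
print(rast_search(start, template, target, eps=0.05, k=3, min_width=0.01))
```

The returned box is only guaranteed to intersect k constraint sets, matching the acceptance rule stated above; a production implementation would typically use a best-first priority queue and verify the final transformation against the actual point sets.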